Abstract

This paper presents algorithms for labeling the connected components
of a binary image using a hypercube or shuffle-exchange
computer. The algorithms label the components of an N^(1/2) x
N^(1/2) pixel image in O(log^2 N) time using a hypercube or
shuffle-exchange computer with N processors and a constant amount
of memory per processor. The algorithms that are presented
are the first to solve this problem in O(log^2 N) time. The
algorithms are based on a divide-and-conquer approach and
use as a subroutine an O(log N) time PRAM algorithm for
labeling the connected components of a graph. The simulation
of the PRAM by the hypercube and shuffle-exchange computers
is particularly efficient because the PRAM that is being simulated
has only O(N^(3/4)) processors and memory cells.

1.- Introduction

The problem of labeling the connected components in a binary
image is addressed. This problem is of great importance to the
image processing community because it forms a bridge between
low-level iconic algorithms and high-level symbolic ones (13),(17).
Because of its importance, a large number of parallel algorithms
have been developed for image component labeling
(3),(4),(5),(6),(7),(8),(9),(12),(16). This paper presents algorithms for
image component labeling using hypercube and shuffle-exchange
computers. These algorithms are asymptotically
faster than previously known algorithms for the problem using
these types of parallel computers. The remainder of this section
examines models of parallel computers, the image component
labeling problem and previous work in the area. Section 2
describes some previously published algorithms that will be
used as subroutines. In Section 3, the hypercube and shuffle-exchange
algorithms are presented and analyzed. Section 4
contains conclusions.


The hypercube, the shuffle-exchange and the PRAM are all
models of parallel computers. All of these models operate in
a synchronous, SIMD manner. Let the processors in each
model be numbered 0 .. N-1. In the hypercube and shuffle-exchange
computers each processor has a constant amount of
local random-access memory and can communicate with other
processors through a fixed interconnection network. In the
hypercube, processor i is connected to processor j if the binary
representations of i and j differ in exactly 1 bit position. In
the shuffle-exchange, processor i is connected to processor j if
j = Shuffle(i), j = Unshuffle(i) or j = Exchange(i), where
Shuffle(i) = 2i mod (N-1), Unshuffle is the inverse of Shuffle,
and Exchange(i) = i + 1 - 2(i mod 2) (15). A hypercube or
shuffle-exchange of size N is a hypercube or shuffle-exchange
computer with N processors.

In the PRAM, all processors have access to a common
memory. In a single time step, all processors can read from or
write to the common memory. A PRAM of size N is a PRAM
computer with N processors and N words of memory. Variants
of the PRAM model differ in allowing multiple processors
simultaneous access to a single memory location. The most
powerful type of PRAM is the CRCW (concurrent read,
concurrent write) PRAM. In this model, any number of processors
may simultaneously read from or write to a single memory
location. When more than one processor tries to read from a
single memory location, all of them succeed. When more than
one processor tries to write to a single memory location, one
of them succeeds. The selection of which processor succeeds
depends on which type of CRCW PRAM model is being used.

The input to the image component labeling problem is an
N^(1/2) x N^(1/2) array of binary pixels. Two 1-valued pixels are
adjacent if they share a vertical or horizontal edge, and they
are connected if there exists a path of adjacent 1-valued pixels
from one to the other. The image component labeling problem
is to label each 1-valued pixel such that any two 1-valued pixels
receive the same label if and only if they are connected. A set
of pixels that must receive the same label is a connected
component of the image.

The image component labeling problem is a special case of
the graph component labeling problem. Specifically, given a
binary image I, create the corresponding undirected graph G
= (V, E) where V consists of the 1-valued pixels in I and E
consists of all pairs (i, j) where i and j are adjacent 1-valued
pixels in I. The connected components in G correspond exactly
to the connected components in I. As a result of this
correspondence, graph component labeling algorithms can be used
to solve the image component labeling problem.
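The correspondence between an image and its graph can be sketched sequentially as follows. The helper names are hypothetical, and the union-find labeler is only a sequential reference for checking results; it is not one of the parallel algorithms discussed in this paper.

```python
def image_to_graph(image):
    """Build G = (V, E) from a binary image given as a list of rows."""
    rows, cols = len(image), len(image[0])
    vertices = [(r, c) for r in range(rows) for c in range(cols)
                if image[r][c] == 1]
    edges = []
    for r, c in vertices:
        # Look right and down only, so each unordered pair appears once.
        for nr, nc in ((r, c + 1), (r + 1, c)):
            if nr < rows and nc < cols and image[nr][nc] == 1:
                edges.append(((r, c), (nr, nc)))
    return vertices, edges


def label_graph_union_find(vertices, edges):
    """Sequential reference labeling: label = smallest vertex in the component."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)
    return {v: find(v) for v in vertices}
```

Two pixels receive the same label exactly when a path of adjacent 1-valued pixels joins them, which is the labeling condition stated above.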

In (14), Shiloach and Vishkin present a PRAM algorithm
for labeling the connected components of an undirected graph
containing v vertices and e edges. Their algorithm requires
O(log v) time on a CRCW PRAM of size v+2e. The type of
CRCW PRAM that they use will be called the Arbitrary-CRCW
PRAM. In this model, the processor that succeeds in writing
to a contested memory location is chosen arbitrarily. By using
the correspondence between graph and image component
labeling that was discussed above, it is clear that Shiloach and
Vishkin's algorithm can be used to obtain an algorithm for
labeling the connected components of an N^(1/2) x N^(1/2) image in
O(log N) time using an Arbitrary-CRCW PRAM of size N.

In (10), Nassimi and Sahni present an algorithm for simulating
a PRAM with a hypercube or shuffle-exchange computer. Their
algorithm simulates a single operation of an Arbitrary-CRCW
PRAM of size N in O(log^2 N) time using a hypercube or
shuffle-exchange of size N. Using this simulation and the
PRAM algorithm mentioned above, it is possible to obtain an
O(log^3 N) time algorithm for labeling the connected components
of an image with a hypercube or shuffle-exchange computer.
The algorithms presented in this paper improve upon this result
by requiring only O(log^2 N) time. They are the first O(log^2 N)
time algorithms for labeling the connected components of an
image with a hypercube or shuffle-exchange computer.


2.-  Subroutines 

There are a number of previously published algorithms that
are used as subroutines in the current paper. This section
briefly discusses these subroutines. One useful subroutine
consists of the Rank and Concentrate algorithms presented in (10).
This subroutine is used to move data in a hypercube or
shuffle-exchange computer. The input is a set of R records,
stored no more than one per processor, in an N processor
machine. The output is the same set of records, now stored
one per processor, in the first R processors. Nassimi and Sahni
present an O(log N) time algorithm for this operation. A slight
generalization of this problem starts with R records, stored no
more than K per processor, and returns with the records stored
in the first ceiling(R/K) processors, again having no more than
K records per processor. For any fixed value of K, this
generalized version of the problem is easily solved in O(log N) time
by simulating a KN processor machine with an N processor
machine and running Nassimi and Sahni's algorithm on the
simulated machine. This generalized algorithm will be called
the Compress(K) algorithm.
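The input/output behaviour of Rank, Concentrate and Compress(K) can be sketched sequentially. This is a minimal sketch that only shows what the operations compute; it does not reproduce the O(log N) parallel routing of (10). The function names are hypothetical, and an empty processor is assumed to be represented by `None`, so records themselves must not be `None`.

```python
def rank(slots):
    """For each occupied slot, count the occupied slots that precede it."""
    ranks, count = [], 0
    for record in slots:
        if record is None:
            ranks.append(None)
        else:
            ranks.append(count)
            count += 1
    return ranks


def concentrate(slots):
    """Move the R records into the first R slots, preserving their order."""
    packed = [None] * len(slots)
    for record, r in zip(slots, rank(slots)):
        if record is not None:
            packed[r] = record
    return packed


def compress(buckets, k):
    """Compress(K): pack records, at most k per processor, into the
    first ceiling(R/k) processors by viewing the machine as k*N slots."""
    virtual = []
    for bucket in buckets:
        assert len(bucket) <= k
        virtual.extend(bucket + [None] * (k - len(bucket)))
    packed = concentrate(virtual)
    return [[rec for rec in packed[p * k:(p + 1) * k] if rec is not None]
            for p in range(len(buckets))]
```

For example, `compress([['a'], [], ['b', 'c'], ['d']], 2)` returns `[['a', 'b'], ['c', 'd'], [], []]`: four records end up in the first two processors.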

The paper that presents the Rank and Concentrate
algorithms shows how a single operation of an Arbitrary-CRCW
PRAM of size N can be simulated in O(log^2 N) time by a
hypercube or shuffle-exchange of size N. The simulation
algorithm consists of an O(log^2 N) time bitonic sort and a number
of O(log N) time routines. Thus the O(log^2 N) time of the
simulation algorithm is due solely to the time required for the
bitonic sort.
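The role that sorting plays in such a simulation can be illustrated for concurrent writes. The sketch below is not the algorithm of (10); it only shows, sequentially and under assumed names, why sorting the requests by address makes conflicts easy to resolve: competing writes become adjacent, and the first request in each run can be declared the winner.

```python
def simulate_concurrent_write(memory, requests):
    """Resolve one Arbitrary-CRCW write step.

    requests is a list of (processor, address, value) triples; exactly
    one write per contested address takes effect.
    """
    # Sorting by address places all writes to the same cell side by side.
    ordered = sorted(requests, key=lambda req: req[1])
    for position, (_, address, value) in enumerate(ordered):
        # The first request of each run of equal addresses wins.
        if position == 0 or ordered[position - 1][1] != address:
            memory[address] = value
    return memory
```

Because Python's `sorted` is stable, the winner here is the earliest request in the input list; any other choice would be equally valid under the Arbitrary-CRCW rule.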

Another important subroutine is also given by Nassimi and
Sahni. In (11), they present algorithms for sorting data on
hypercube and shuffle-exchange computers. For any fixed
value of K > 1, the least-significant-digit radix sort algorithm
that they present sorts N integers, each of which is in the range
1 .. N^(1+1/K), in O(log N) time using a hypercube or
shuffle-exchange of size N^(1+1/K). This algorithm is important because
it can be used to simulate a single operation of an
Arbitrary-CRCW PRAM of size N with a hypercube or shuffle-exchange
of size N^(1+1/K) in O(log N) time. This simulation is identical
to the O(log^2 N) time simulation presented in (10), except that
the bitonic sort is replaced by the least-significant-digit radix
sort. This O(log N) time simulation algorithm will be referred
to as the Fast_Simulation(K) algorithm.
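For readers unfamiliar with least-significant-digit radix sorting, a sequential sketch follows. It shows only the digit-by-digit structure; how (11) carries out each pass in parallel is not reproduced, and the function name and `radix` parameter are assumptions of this example.

```python
def lsd_radix_sort(keys, radix):
    """Sort non-negative integers by stable passes over base-`radix` digits,
    starting with the least significant digit (radix must be at least 2)."""
    result = list(keys)
    if not result:
        return result
    largest = max(result)
    digit_weight = 1
    while digit_weight <= largest:
        # Stable distribution pass on the current digit.
        buckets = [[] for _ in range(radix)]
        for key in result:
            buckets[(key // digit_weight) % radix].append(key)
        result = [key for bucket in buckets for key in bucket]
        digit_weight *= radix
    return result
```

The number of passes equals the number of digits in the largest key. With a radix of about N^(1/K), keys below N^(1+1/K) have at most K+1 digits, so for fixed K the number of passes is a constant.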

The other subroutine that will be used in this paper is
Shiloach and Vishkin's PRAM algorithm for labeling the
connected components of a graph (14). As was mentioned earlier,
their algorithm uses an Arbitrary-CRCW PRAM of size v + 2e
to label the connected components of a graph containing v
vertices and e edges in O(log v) time. In order to describe their
algorithm, a few terms must be defined. A rooted tree is a
directed graph where: 1) the underlying undirected graph is a
tree, and 2) there is a directed path from each vertex in the
rooted tree to the root vertex. A rooted star is a rooted tree
in which every vertex (including the root) points directly to the
root.


In the algorithm, each vertex i has associated with it a
variable  D(i)  that  points  to  some  vertex  in  the  graph.  If  the 
pairs  (i,  D(i))  are  viewed  as  being  directed  edges,  then  the 
variables  D(i)  define  a  graph  that  is  called  the  pointer  graph. 
Throughout  the  algorithm,  the  pointer  graph  consists  of  a  for¬ 
est  of  rooted  trees.  At  the  start  of  the  algorithm,  D(i)  =  i  for 
all  i,  so  the  pointer  graph  consists  of  v  rooted  stars  each  con¬ 
taining  1  vertex.  The  algorithm  proceeds  by  performing  a 
number  of  "shortcut"  operations  that  reduce  the  heights  of  the 
rooted  trees  and  "hooking"  operations  that  merge  rooted  trees. 
At  the  end  of  the  algorithm,  the  pointer  graph  consists  of  a 
forest  of  rooted  stars,  where  each  star  contains  the  vertices  of 
one  of  the  connected  components  in  the  original  graph.  Thus 
the  D(i)  pointer  fields  can  be  considered  to  be  labels  that 
partition  the  graph  into  connected  components.  Because  each 
shortcut  and  hooking  operation  requires  constant  time,  and 
because Shiloach and Vishkin prove that only O(log v) shortcut
and  hooking  operations  are  required,  the  entire  algorithm  runs 
in  O(log  v)  time. 
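The interplay of hooking and shortcut operations on the pointer graph can be illustrated with a deliberately simplified sequential variant. This is not Shiloach and Vishkin's algorithm and carries none of its O(log v) guarantee: it hooks a root only onto a smaller-numbered vertex (an assumption that keeps the pointer graph free of cycles) and simply repeats until nothing changes.

```python
def hook_and_shortcut(num_vertices, edges):
    """Label components with a simplified hook/shortcut loop.

    Returns D, where D[i] is the smallest vertex in i's component.
    """
    D = list(range(num_vertices))        # v rooted stars of one vertex each
    changed = True
    while changed:
        changed = False
        # Hooking: merge trees joined by an edge, larger root under smaller label.
        for u, v in edges:
            for a, b in ((u, v), (v, u)):
                root = D[a]
                if D[root] == root and D[b] < root:
                    D[root] = D[b]
                    changed = True
        # Shortcut: each vertex points to its grandparent, halving tree heights.
        for i in range(num_vertices):
            if D[i] != D[D[i]]:
                D[i] = D[D[i]]
                changed = True
    return D
```

When the loop stops, every vertex points directly at a root and no edge joins two different stars, so the D(i) values partition the graph into its connected components, as described above.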


3.- The Image Component Labeling Algorithms

Having discussed the necessary subroutines, it is now possible
to describe the O(log^2 N) time hypercube and shuffle-exchange
algorithms for image component labeling. This section gives a
brief description of the algorithms. This description is
accompanied by an example that is presented in Figures 1-4. Then
a more detailed description and a proof of correctness are given.
Finally, an analysis of the running time is presented.

The hypercube and shuffle-exchange algorithms are very
closely related to one another. Both use N processors to label
the connected components in an N^(1/2) x N^(1/2) pixel binary image
(see Figure 1). They are based on a divide-and-conquer
technique where the N pixel image is divided into approximately
N^(1/2) square windows each containing approximately N^(1/2) pixels.
The connected components within the windows are labeled
using a recursive call (see Figure 2).

After the windows have been labeled, adjacent 1-valued
pixels have the same label unless they lie on the borders of
different windows. The next task is to correct the labels of the
pixels that lie on the borders of the windows. This is
accomplished by using Shiloach and Vishkin's O(log N) time PRAM
algorithm for labeling the connected components of a graph.
This gives the border pixels their correct labels (see Figure 3).
Then the non-border pixels are relabeled according to the labels
that were assigned to the border pixels. This is the desired
labeling of the components in the image (see Figure 4). Then
the last step changes the labels of some of the components.
This step is required in order to make the recursive call function
correctly.

The details of the image component labeling algorithms are
given below. It is assumed that N = 4^n. Each processor i has
a variable D(i) that holds the current label of its pixel.

Step 1:

• If N = 1, set D(i) := i. Otherwise, recursively label the M
x M windows of the image in parallel, where M = 2^m and
m = ceiling(n/2). This step sets the variable D(i) in all of
the processors.


Step 2:

• Let S = the set of processors that are on the borders of the
M x M windows and contain 1-valued pixels. For each i in
S, processor i creates up to 4 edge records as follows. For
each j in S, if i is adjacent to j and if i and j are in different
M x M windows, then processor i creates the record <i, j>.
If D(i) ≠ i, then processor i creates the records <i, D(i)>
and <D(i), i>.


Step 3:

• The edge records are placed in the first N/M processors, with
each processor holding at most 12 edge records. The
Compress(12) subroutine is used to accomplish this.


Step 4:

• Shiloach and Vishkin's O(log N) time PRAM algorithm (14)
is used to label the connected components of the graph
represented by the edge records created in Step 2. The edge
records form the input vector E and the processors in S
correspond to the PRAM processors used in the algorithm.
Notice that there are at most 12N^(3/4) edge records and 4N^(3/4)
processors in S. The Fast_Simulation algorithm is used to
simulate an Arbitrary-CRCW PRAM of size O(N^(3/4)) with a
hypercube or shuffle-exchange of size N. This step relabels
the border pixels.
Step 5:

• All processors (including those not in S) set D(i) := D(D(i)).
At this point each connected component in the N^(1/2) x N^(1/2)
image is represented by a rooted star. This step relabels the
non-border pixels according to the labels that were assigned
to the border pixels in Step 4.

Step  6: 

• Let T = the set of processors that are on the border of the
N^(1/2) x N^(1/2) image. For each i in T, processor i attempts to
set D(D(i)) := i. For each variable D(D(i)) being written
to, it is assumed that one (arbitrarily selected) write attempt
succeeds. Then all processors set D(i) := D(D(i)). This step
ensures that the recursive call will work correctly.

After  Step  1,  any  pair  of  pixels  that  have  the  same  label  are 
in  fact  in  the  same  connected  component.  However,  it  is 
possible  that  more  than  one  label  has  been  assigned  to  a  single 
connected  component  (this  happens  whenever  a  component 
appears  in  more  than  one  window).  The  purpose  of  Steps  2-5 
is  to  relabel  pixels  so  that  only  1  label  is  assigned  to  each 
connected  component.  It  will  be  shown  that  Steps  2-5  do  in 
fact  accomplish  this. 
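A single level of Steps 1-5 can be traced with a sequential sketch. Everything here is a stand-in chosen for illustration: the function name is hypothetical, Step 1 is replaced by a breadth-first search inside each window that roots every component at a window-border pixel when it has one (the property that Step 6 is meant to provide), Step 3 is skipped because no data movement is needed sequentially, and Step 4 uses a simple hook-and-shortcut loop rather than Shiloach and Vishkin's algorithm. The image is assumed to be square with a side that is a multiple of M.

```python
from collections import deque


def combine_one_level(image, M):
    """Label a square binary image using M x M windows (Steps 1-5, one level)."""
    size = len(image)
    index = lambda r, c: r * size + c            # processor number of pixel (r, c)
    window = lambda r, c: (r // M, c // M)
    on_border = lambda r, c: r % M in (0, M - 1) or c % M in (0, M - 1)
    ones = [(r, c) for r in range(size) for c in range(size) if image[r][c] == 1]

    def neighbors(r, c):
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size and image[nr][nc] == 1:
                yield nr, nc

    # Step 1 stand-in: label each window on its own, rooting every
    # component at a window-border pixel whenever it contains one.
    D, seen = {}, set()
    for start in ones:
        if start in seen:
            continue
        seen.add(start)
        component, queue = [], deque([start])
        while queue:
            pixel = queue.popleft()
            component.append(pixel)
            for nxt in neighbors(*pixel):
                if nxt not in seen and window(*nxt) == window(*pixel):
                    seen.add(nxt)
                    queue.append(nxt)
        border = [p for p in component if on_border(*p)]
        root = border[0] if border else component[0]
        for p in component:
            D[index(*p)] = index(*root)

    # Step 2: edge records created by the processors in S.
    S = [p for p in ones if on_border(*p)]
    edges = set()
    for p in S:
        i = index(*p)
        for q in neighbors(*p):
            if window(*q) != window(*p):
                edges.add((i, index(*q)))
        if D[i] != i:
            edges.add((i, D[i]))
            edges.add((D[i], i))

    # Step 4 stand-in: label the border graph by hooking roots onto
    # smaller labels and shortcutting until nothing changes.
    L = {index(*p): index(*p) for p in S}
    changed = True
    while changed:
        changed = False
        for a, b in edges:
            root = L[a]
            if L[root] == root and L[b] < root:
                L[root] = L[b]
                changed = True
        for v in L:
            if L[v] != L[L[v]]:
                L[v] = L[L[v]]
                changed = True
    D.update(L)

    # Step 5: all pixels simultaneously set D(i) := D(D(i)).
    return {i: D[D[i]] for i in D}
```

Only the border pixels take part in the graph labeling; the interior pixels are corrected by the single pointer jump at the end, which is what keeps the simulated PRAM small.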

Let Label(i, k) be the label assigned to pixel i after Step k.
Let G be the undirected graph created in Steps 2 and 3. Note
that G = (V, E) where V = S and E is the set of all unordered
pairs (i, j) where i and j are in S, and either j = Label(i, 1) or
else i and j are adjacent and located in different M x M windows.

Claim 1: If Step 1 correctly labels the M x M windows,
then for all i, j in S, Label(i, 4) = Label(j, 4) if and only if i
and j are in the same connected component in the image.

Proof: (Omitted)

Claim 2: If Step 1 correctly labels the M x M windows,
then for all 1-valued pixels i and j, Label(i, 5) = Label(j, 5) if
and only if i and j are in the same connected component in
the image.

Proof:  (Omitted) 

Claim 3: For all 1-valued pixels i and j, Label(i, 5) =
Label(j, 5) if and only if i and j are in the same connected
component in the image.


Proof:  (Omitted) 

Claim 3 shows that the image is correctly labeled after Step
5. Because Step 6 only relabels some of the components, and
no two of the components are given the same new label, the
image is also correctly labeled after Step 6. Step 6 ensures that
all of the image components that contain an element of T are
rooted in T. This is necessary so that when the algorithm is
called recursively in Step 1, the labels that are assigned guarantee
that in Step 2, if i is in S then D(i) is also in S.

Step 2 requires constant time. Step 3 requires O(log N)
time. Step 4 consists of O(log N) PRAM steps, each of which
is simulated in O(log N) time, so Step 4 requires O(log^2 N)
time. Steps 5 and 6 can each be considered to be a single
operation of an Arbitrary-CRCW PRAM of size N, and can
be simulated in O(log^2 N) time (10). The total time for Steps
2 through 6 is thus less than or equal to C log^2 N for some
constant C, and the total time for the algorithm is thus O(log^2
N).
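The last inference, from the cost of Steps 2 through 6 to the total running time, can be spelled out as a recurrence. The following is a sketch under the simplifying assumption that every window contains exactly N^(1/2) pixels; T(N) denotes the running time on an N pixel image and C is the constant named above.

```latex
\begin{align*}
T(N) &\le T\bigl(N^{1/2}\bigr) + C \log^2 N,
      \qquad \log N^{1/2} = \tfrac{1}{2} \log N, \\
T(N) &\le C \log^2 N \sum_{k \ge 0} 4^{-k} + O(1)
      \;=\; \tfrac{4}{3}\, C \log^2 N + O(1)
      \;=\; O(\log^2 N).
\end{align*}
```

Each level of recursion halves log N and therefore costs one quarter of the level above it, so the top level dominates the sum.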


4.-  Conclusion 

This paper has shown that O(log^2 N) time image component
labeling algorithms are possible on the hypercube and
shuffle-exchange computers. The algorithms that are presented make
use of a divide-and-conquer strategy. The image is divided into
windows that are labeled recursively, and the results of these
labelings are then combined to obtain the final result. The
combination step uses a PRAM algorithm for labeling the
connected components of a graph. The key to the algorithms
is the reduction of the amount of data that must be routed
during this combination step. This reduction is due to the fact
that only the borders of the windows need to be considered.

These algorithms demonstrate that PRAM algorithms can
be very helpful in designing fast algorithms for realistic parallel
machines, but that they must be used carefully. A
straightforward approach would simply simulate the PRAM algorithm
on the hypercube or shuffle-exchange computer. This approach
yields O(log^3 N) time algorithms. By using a divide-and-conquer
approach and only simulating the PRAM algorithm when the
amount of data to be processed is reduced, O(log^2 N) time
algorithms are obtained.


Bibliography 


1. R. Cypher, J. L. C. Sanz, L. Snyder, "EREW PRAM
and Mesh Connected Computer Algorithms for Image
Component Labeling", to appear in 1987 IEEE Computer
Society Workshop on Computer Architecture,
Pattern Analysis and Machine Intelligence.

2. R. Cypher, J. L. C. Sanz, L. Snyder, "Algorithms for
Image Component Labeling on SIMD Mesh Connected
Computers", to appear in Proc. 1987 Intl. Conference
on Parallel Processing.

3. R. Hummel, "Connected Component Labelling in Image
Processing with MIMD Architectures" in
Intermediate-Level Image Processing, Academic Press,
1986, pp. 101-127.

4. R. Hummel, A. Rojer, "Implementing a Parallel Connected
Component Algorithm on MIMD Architectures",
IEEE Computer Society Workshop on Computer
Architecture for Pattern Analysis and Image
Data Base Management, Miami, Florida, 1985.

5. Y. Hung, A. Rosenfeld, "Parallel Processing of Linear
Quadtrees on a Mesh-Connected Computer", Tech.
Rep. CAR-TR-278, Center for Automation Research,
U. of Maryland, March 1987.

6. V. K. P. Kumar, M. M. Eshaghian, "Parallel Geometric
Algorithms for Digitized Pictures on Mesh of Trees"
(preliminary version), Proc. 1986 Intl. Conference on
Parallel Processing, pp. 270-273.

7. W. Lim, "Fast algorithms for labeling connected components
in 2-D arrays", Tech. Rep. 86.22, Thinking
Machines Corp., Cambridge, Mass., July 1986.

8. R. Miller, Q. Stout, "Varying Diameter and Problem
Size in Mesh-Connected Computers" (preliminary version),
Proc. 1985 Intl. Conference on Parallel Processing,
pp. 697-699.


9. D. Nassimi, S. Sahni, "Finding Connected Components
and Connected Ones on a Mesh-Connected Parallel
Computer", SIAM J. Comput., vol. 9, no. 4, November
1980, pp. 744-757.

10. D. Nassimi, S. Sahni, "Data Broadcasting in SIMD
Computers", IEEE Transactions on Computers, vol.
c-30, no. 2, February 1981, pp. 101-107.

11. D. Nassimi, S. Sahni, "Parallel Permutation and Sorting
Algorithms and a New Generalized Connection Network",
Journal of the ACM, vol. 29, no. 3, July 1982,
pp. 642-667.

12. A. Rosenfeld, "Parallel Image Processing Using Cellular
Arrays", IEEE Computer, pp. 14-20, 1983.

13. A. Rosenfeld, A. Kak, Digital Picture Processing,
Academic Press, vols. 1-2, 1982.

14. Y. Shiloach, U. Vishkin, "An O(log n) Parallel Connectivity
Algorithm", Journal of Algorithms, vol. 3,
1982, pp. 57-67.

15. H. S. Stone, "Parallel Processing with the Perfect Shuffle",
IEEE Transactions on Computers, vol. c-20, no.
2, February 1971, pp. 153-161.

16. Q. F. Stout, "Properties of Divide-and-Conquer Algorithms
for Image Processing", 1985 IEEE Computer
Society Workshop on Computer Architecture for Pattern
Analysis and Image Database Management, pp.
203-209, 1985.

17. S. Tanimoto, "Architectural Issues for Intermediate-Level
Vision" in Intermediate-Level Image Processing,
Academic Press, 1986, pp. 3-16.


Acknowledgement

The work of R. Cypher was supported in part by a National
Science Foundation Fellowship and the work of L. Snyder was
supported in part by the National Science Foundation Grant
DCR 8416878.


